Recommended from our members

Method51 for mining insight from social media datasets

Author: Reffin Jeremy
Weir David
Wibberley Simon
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 23/08/2014

We present Method51, a social media analysis software platform with a set of accompanying methodologies. We discuss a series of case studies illustrating the platform’s application and motivating our methodological proposals.

Learning to distinguish hypernyms and co-hyponyms

Author: Clarke Daoud
Keller Bill
Reffin Jeremy
Weeds Julie
Weir David
Publication venue: Dublin City University and Association for Computational Linguistics
Publication date: 01/08/2014

This work is concerned with distinguishing the different semantic relations that exist between distributionally similar words. We compare a novel approach, based on training a linear Support Vector Machine on pairs of feature vectors, with state-of-the-art methods based on distributional similarity. We show that the new supervised approach does better even when there is minimal information about the target words in the training data, giving a 15% reduction in error rate over unsupervised approaches.

Analysis and Policy Observatory (APO)

Anti-social media

Author: Jamie Bartlett
Jeremy Reffin
Noelle Rumball
Sarah Williamson
Publication venue: Demos
Publication date

To inform the discussion over free speech and hate speech, this study examines the way racial, religious and ethnic slurs are employed on Twitter. Executive summary: How to define the limits of free speech is a central debate in most modern democracies. This is particularly difficult in relation to hateful, abusive and racist speech. The pattern of hate speech is complex. But there is increasing focus on the volume and nature of hateful or racist speech taking place online, and new modes of communication mean it is easier than ever to find and capture this type of language. How and whether to respond to certain types of language use without curbing freedom of expression in this online space is a significant question for policy makers, civil society groups, law enforcement agencies and others. This short study aims to inform these difficult decisions by examining specifically the way racial and ethnic slurs (henceforth, ‘slurs’) are used on the popular microblogging site, Twitter. Slurs relate specifically to a set of words, terms, or nicknames which are used to refer to groups in a society in a derogatory, pejorative or insulting manner. Slurs can be used in a hateful way, but that is not always the case. Therefore, this research is not about hate speech per se, but about epistemology and linguistics: word use and meaning. In this study, we aim to answer the following two questions: (a) In what ways are slurs being used on Twitter, and in what volume? (b) What is the potential for automated machine learning techniques to accurately identify and classify slurs?

Aligning packed dependency trees: a theory of composition for distributional semantics

Author: Kober Thomas
Reffin Jeremy
Weeds Julie
Weir David
Publication venue: 'MIT Press - Journals'
Publication date: 25/08/2016

We present a new framework for compositional distributional semantics in which the distributional contexts of lexemes are expressed in terms of anchored packed dependency trees. We show that these structures have the potential to capture the full sentential contexts of a lexeme and provide a uniform basis for the composition of distributional knowledge in a way that captures both mutual disambiguation and generalization.

arXiv.org e-Print Archive

Improving Semantic Composition with Offset Inference

Author: Kober Thomas
Reffin Jeremy
Weeds Julie
Weir David
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2017

Count-based distributional semantic models suffer from sparsity due to unobserved but plausible co-occurrences in any text collection. This problem is amplified for models like Anchored Packed Trees (APTs) that take the grammatical type of a co-occurrence into account. We therefore introduce a novel form of distributional inference that exploits the rich type structure in APTs and infers missing data by the same mechanism that is used for semantic composition. Comment: to appear at ACL 2017 (short papers).

arXiv.org e-Print Archive

Improving sparse word representations with distributional inference for semantic composition

Author: Kober Thomas
Reffin Jeremy
Weeds Julie
Weir David
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

Distributional models are derived from co-occurrences in a corpus, where only a small proportion of all possible plausible co-occurrences will be observed. This results in a very sparse vector space, requiring a mechanism for inferring missing knowledge. Most methods face this challenge in ways that render the resulting word representations uninterpretable, with the consequence that semantic composition becomes hard to model. In this paper we explore an alternative which involves explicitly inferring unobserved co-occurrences using the distributional neighbourhood. We show that distributional inference improves sparse word representations on several word similarity benchmarks and demonstrate that our model is competitive with the state-of-the-art for adjective-noun, noun-noun and verb-object compositions while being fully interpretable.

Analysing trade-offs and synergies between SDGs for urban development, food security and poverty alleviation in rapidly changing peri-urban areas: a tool to support inclusive urban planning

Author: Butcher Bradley
Dolley Jonathan
Eray Baris
Marshall Fiona
Quadrianto Novi
Reffin Jeremy
Robinson James Alexander
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 02/04/2020

Transitional peri-urban contexts are frontiers for sustainable development where land-use change involves negotiation and contestation between diverse interest groups. Multiple, complex trade-offs between outcomes emerge which have both negative and positive impacts on progress towards achieving Sustainable Development Goals (SDGs). These trade-offs are often overlooked in policy and planning processes which depend on top-down expert perspectives and rely on coarse-grained aggregate data which does not reflect complex peri-urban dynamics or the rapid pace of change. Tools are required to address this gap, integrate data from diverse perspectives and inform more inclusive planning processes. In this paper, we draw on a reinterpretation of empirical data concerned with land-use change and multiple dimensions of food security from the city of Wuhan in China to illustrate some of the complex trade-offs between SDGs that tend to be overlooked with current planning approaches. We then describe the development of an interactive web-based tool that implements deep learning methods for fine-grained land-use classification of high-resolution remote sensing imagery and integrates this with a flexible method for rapid trade-off analysis of land-use change scenarios. The development and potential use of the tool are illustrated using data from the Wuhan case study. This tool has the potential to support participatory planning processes by providing a platform for multiple stakeholders to explore the implications of planning decisions and land-use policies. Used alongside other planning, engagement and ecosystem service mapping tools, it can help to reveal invisible trade-offs and foreground the perspectives of diverse stakeholders. This is vital for building approaches which recognise how trade-offs between the achievement of SDGs can be influenced by development interventions.

A critique of word similarity as a method for evaluating distributional semantic models

Author: Batchkarov Miroslav
Kober Thomas
Reffin Jeremy
Weeds Julie
Weir David
Publication venue: 'Association for Computational Linguistics (ACL)'
Publication date: 01/01/2016

This paper aims to re-think the role of the word similarity task in distributional semantics research. We argue that while it is a valuable tool, it should be used with care because it provides only an approximate measure of the quality of a distributional model. Word similarity evaluations assume there exists a single notion of similarity that is independent of a particular application. Further, the small size and low inter-annotator agreement of existing data sets make it challenging to find significant differences between models.

Disrupting Daesh: measuring takedown of online terrorist material and its impacts

Author: Conway Maura
Khawaja Moign
Lakhani Suraj
Reffin Jeremy
Robertson Andrew
Weir David
Publication venue: The Vox-Pol Network of Excellence
Publication date: 01/08/2017

This report seeks to contribute to public and policy debates on the value of social media disruption activity with respect to terrorist material. We look in particular at aggressive account and content takedown, with the aim of accurately measuring this activity and its impacts. Our findings challenge the notion that Twitter remains a conducive space for Islamic State (IS) accounts and communities to flourish, although IS continues to distribute propaganda through this channel. However, not all jihadists on Twitter are subject to the same high levels of disruption as IS, and we show that there is differential disruption taking place. IS’s and other jihadists’ online activity was never solely restricted to Twitter. Twitter is just one node in a wider jihadist social media ecology. We describe and discuss this, and supply some preliminary analysis of disruption trends in this area.

DCU Online Research Access Service